Method and apparatus for dynamic DLL powerdown and memory self-refresh

ABSTRACT

Embodiments of the present invention provide a method and apparatus for conserving power in an electronic device. In particular, embodiments of the present invention dynamically place the memory in self-refresh and chipset clock circuits in power down mode while keeping the isochronous streams (such as display) updated and servicing bus master cycles in a power savings mode.

BACKGROUND

Computing devices, particularly portable devices, are frequently limited by the amount of time that they can run on battery power without reconnection to an AC power supply. Thus, there is a continuous effort to reduce the power consumption of various components of computers, including the central processing unit. Keeping electronic devices such as a central processing unit, a memory controller or a memory in their lowest possible power state provides a number of benefits. For example, it allows battery-operated machines to operate for longer periods of time between recharging. A reduction in power consumption also reduces thermal dissipation by the central processing unit. Reduced thermal dissipation allows the central processing unit to run at full speed for longer periods of time, while remaining within its thermal dissipation specifications. Reduced thermal dissipation also reduces the need for fans and other components used to prevent heat build-up in a computer.

A standard specification used in developing power management systems is the advanced configuration and power interface (ACPI) specification (for example, rev. 2.0 dated Jul. 27, 2000; see also ACPI Component Architecture Programmer Reference, rev. 1.05 dated Feb. 27, 2001 available from Intel Corporation of Santa Clara, Calif.). One goal of the ACPI is to enhance power management functionality and robustness, as well as to facilitate industry wide implementation of common power management features.

The ACPI defines a number of processor power states that are processor power consumption and thermal management states within a global working state. These processor states include (i) a CØ power state, (ii) a C1 power state, (iii) a C2 power state, and (iv) a C3 power state. In the CØ power state, the processor executes instructions and is at full power. In the C1 and C2 power states, the processor is in a non-executing power state. However, the C2 power state uses less power than the C1 state. In the C1 and C2 power states, the processor still allows the bus to snoop the processor cache memory and thereby maintain cache coherency. The C3 power state offers improved power savings over the C1 and C2 power states, but at the cost of higher power down exit latency to memory.

In conventional systems, the power management logic causes the CPU to transition from a C2 power state back to a high-powered CØ power state under certain circumstances. Keeping the electronic device in a lower power state than could otherwise be achieved and reducing the number of transitions between power states improves system performance by reducing latencies caused by switching between designated power states, as well as keeping the overall power consumption lower.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a diagram of an embodiment of transitions between processor power states in the ACPI specification.

FIG. 2 illustrates a flow diagram of an embodiment of a routine for placing the memory in self-refresh and memory delay locked loops (DLLs) in power down mode while keeping the display updated and maintaining use of bus masters during the C2 power state for an integrated graphics configuration.

FIG. 3 is a diagram of an embodiment of an exemplary integrated graphics configuration for placing the memory in self-refresh and DLL in power down mode while maintaining use of bus masters and keeping the display updated during the C2 power state.

FIGS. 4(a) and (b) illustrate flow diagrams of embodiments of routines for placing the memory in self-refresh and DLLs in power down mode while maintaining use of bus masters during the C2 power state for a discrete configuration.

DETAILED DESCRIPTION

Embodiments of the present invention provide a method and apparatus for conserving power in an electronic device. In particular, embodiments of the present invention dynamically place the memory in self-refresh and chipset clock circuits in power down mode while keeping the display updated and servicing bus master cycles in a power savings mode, such as C2. Maintaining the processor in a power savings mode, such as C2, saves power and reduces the power difference between integrated and non-integrated graphics chipset platforms even when snoopable bus mastering cycles are occurring (unlike in the C3 state, for example).

In the detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.

Some portions of the detailed description that follow are presented in terms of algorithms and symbolic representations of operations on data bits or binary signals within a computer. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to convey the substance of their work to others skilled in the art. An algorithm is here, and generally, considered to be a self-consistent sequence of steps leading to a desired result. The steps include physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers or the like. It should be understood, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the specification, discussions utilizing such terms as “processing” or “computing” or “calculating” or “determining” or the like, refer to the action and processes of a computer or computing system, or similar electronic computing device, that manipulate and transform data represented as physical (electronic) quantities within the computing system's registers and/or memories into other data similarly represented as physical quantities within the computing system's memories, registers or other such information storage, transmission or display devices.

Embodiments of the present invention may be implemented in hardware or software, or a combination of both. However, embodiments of the invention may be implemented as computer programs executing on programmable systems comprising at least one processor, a data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code may be applied to input data to perform the functions described herein and generate output information. The output information may be applied to one or more output devices, in known fashion. For purposes of this application, a processing system includes any system that has a processor, such as, for example, a digital signal processor (DSP), a micro-controller, an application specific integrated circuit (ASIC), or a microprocessor.

The programs may be implemented in a high level procedural or object oriented programming language to communicate with a processing system. The programs may also be implemented in assembly or machine language, if desired. In fact, the invention is not limited in scope to any particular programming language. In any case, the language may be a compiled or interpreted language.

The programs may be stored on a storage media or device (e.g., hard disk drive, floppy disk drive, read only memory (ROM), CD-ROM device, flash memory device, digital versatile disk (DVD), or other storage device) readable by a general or special purpose programmable processing system, for configuring and operating the processing system when the storage media or device is read by the processing system to perform the procedures described herein. Embodiments of the invention may also be considered to be implemented as a machine-readable storage medium, configured for use with a processing system, where the storage medium so configured causes the processing system to operate in a specific and predefined manner to perform the functions described herein.

FIG. 1 illustrates a diagram of an embodiment 100 of transitions between processor power states in the ACPI specification. All states, the CØ state 102, the C1 state 104, the C2 state 106 and the C3 state 108 are encompassed within a GØ working state 110. A GØ working state is defined by the ACPI specification as a computer state where the system dispatches user mode (application) threads. In the GØ working state, these threads are executed. In this state, devices (peripherals) are dynamically having their power states changed. Within this GØ state 110, a processor transitions between various processor power states including the CØ state 102, the C1 state 104, the C2 state 106, and the C3 state 108.

In the CØ state 102, the processor is at full power. In this state, the components of a typical system are powered and the clocks in the system can run at full speed. The C1 state 104 defines a non-executing state in which the processor power state has the lowest latency.

The C2 state 106 is a second non-executing power state which offers improved power savings over the C1 state 104. The C2 state 106 is a common chipset mode while a computer is in a passive state (i.e. operating system idle) and connected to bus masters such as USB devices or audio ports. During the C2 state 106, discrete chipsets access memory primarily to service bus master cycles and integrated graphics chipsets access memory primarily to fetch display refresh data, service bus master cycles or continue graphics rendering. The CPU does not need to access memory. The DRAM memory operates in an extended power conservation mode, sometimes referred to as a stand-by mode, or self refresh. A refresh unit recharges electrical cells within DRAM memory in order to maintain data integrity.

The C3 power state 108 offers improved savings over both the C1 state 104 and the C2 state 106. While in the C3 state 108, the processor's caches maintain the current information state and snoops are not possible. The processor is brought back out to the CØ, C1 or C2 states to handle snoopable traffic.

The transitions between states occur from the CØ state 102 along path 112 to the C1 state 104 and back to the CØ state 102 along return path 114. Transitions also occur from the CØ state 102 to the C2 state 106 along path 116 and return to the CØ state 102 along path 118. Finally, transitions occur from the CØ state 102 along path 120 to the C3 state 108 and return to the CØ state 102 along path 122. CPU inactivity for a sufficient duration will trigger a transition from the CØ state 102 to the C2 state 106 along path 116. A break event such as an interrupt will result in a transition of the system from the C2 state 106 along path 118 to the CØ state 102.
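The transitions of FIG. 1 can be sketched as a small lookup table. This is an illustrative model only: the names `CState`, `PATHS` and `transition` are hypothetical, and the table records just the six paths 112 through 122 named in the description, not every transition a real processor may support.

```python
from enum import Enum


class CState(Enum):
    C0 = 0  # full power, executing
    C1 = 1  # non-executing, lowest latency
    C2 = 2  # non-executing, snoopable, more savings than C1
    C3 = 3  # deepest savings, snoops not possible


# (from, to) -> path numeral used in FIG. 1 of the description.
PATHS = {
    (CState.C0, CState.C1): 112,
    (CState.C1, CState.C0): 114,
    (CState.C0, CState.C2): 116,
    (CState.C2, CState.C0): 118,
    (CState.C0, CState.C3): 120,
    (CState.C3, CState.C0): 122,
}


def transition(current, target):
    """Return the FIG. 1 path numeral, or raise if no direct path is listed."""
    try:
        return PATHS[(current, target)]
    except KeyError:
        raise ValueError(
            f"no direct path from {current.name} to {target.name}"
        ) from None
```

For example, `transition(CState.C0, CState.C2)` models CPU inactivity leading into C2, and `transition(CState.C2, CState.C0)` models a break event such as an interrupt.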

It should be recognized that although this system is described according to the ACPI specification's power states of CØ, C1, C2 and C3 for convenience, the invention is not limited by the ACPI specification. In general, for embodiments not following the ACPI specification, the CØ power state is defined for purposes of this invention as a full power state in which the CPU carries on its normal functions. The ACPI C2 power state is defined generally to be an intermediate power state between full power and the C3 power state. With an Intel processor, the C2 power state is equivalent to the STOP GRANT state. In general the C2 power state allows snooping memory accesses and maintaining cache coherency.

FIG. 2 illustrates a flow diagram of an embodiment 200 of a routine for placing the memory in self-refresh and DLLs in power down mode while keeping the display updated and maintaining use of bus masters during the C2 power state for an integrated graphics configuration. Embodiments of the present invention (1) place the memory in self-refresh during idle times, rather than just in precharge power down mode and/or (2) dynamically power down the DDR clocks/DLLs. For purposes of this invention, this power savings state is referred to as “C2 self-refresh” even though more power savings are obtained than just memory going into self-refresh. In particular, since the other bus masters on the platform generally have very large latency tolerance compared to display, display updates can proceed properly as long as the buffering provided for display is sufficient to cover the maximum exit latency for memory to come out of self-refresh. If a non-isochronous bus master has started to do a very long burst to memory when a display request must be served, the completion of the bus master request can be postponed until after the display request has been serviced. As long as any isochronous streams (for example, isochronous audio) that must also get memory access are of sufficiently short burst sizes that they stay within the bounds of the ability of the other isochronous streams (for example, display) to handle latency, and as long as these streams request memory accesses at a rate lower than that required to exit memory self-refresh, then the C2 self-refresh state can be enabled. Isochronous streams have the characteristic that their maximum burst sizes and minimum repetition rates are deterministic in the platform, so it is easy to know when the C2 self-refresh state is achievable.
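The enabling conditions described above lend themselves to a precomputed check. The following is a minimal sketch under stated assumptions: `IsoStream`, `c2_self_refresh_allowed` and all parameter names are hypothetical, times are in microseconds, and the rule that one competing burst plus the exit latency must fit within the display's latency tolerance is one plausible reading of the burst-size condition rather than a formula given in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsoStream:
    name: str
    max_burst_us: float   # worst-case time one burst occupies memory
    min_period_us: float  # shortest interval between successive requests


def c2_self_refresh_allowed(display_tolerance_us, exit_latency_us, streams):
    """Return True if the C2 self-refresh state may be enabled."""
    # Display buffering must cover the maximum self-refresh exit latency.
    if display_tolerance_us <= exit_latency_us:
        return False
    for s in streams:
        # Assumed reading: exit latency plus one competing isochronous
        # burst must still fit within what the display can tolerate.
        if exit_latency_us + s.max_burst_us > display_tolerance_us:
            return False
        # Requests must be spaced further apart than the exit latency.
        if s.min_period_us <= exit_latency_us:
            return False
    return True
```

Because maximum burst sizes and minimum repetition rates are deterministic, a check of this kind could be evaluated once when the platform is configured rather than on every request.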

In step 202, the processor is confirmed to be in the C2 power state.

In step 204, lack of memory requests from any source (bus master, display refresh) is confirmed.

In step 206, the memory burst size and display FIFO threshold level are set to predefined levels conducive to the C2 power state. In particular, as shown in FIGS. 3 and 4 and discussed in detail below, the display FIFO has a threshold level that triggers a burst request when it is reached. The FIFO threshold value is set such that the memory bursts that are required for display refresh are large enough, and spaced far enough apart in time, so that substantial power down time in the C2 power state is possible before the DDR DLLs and chipset memory need to be re-enabled. In a typical configuration for an integrated graphics configuration, display logic manages a display FIFO. The threshold value is present in a threshold register. The threshold value is programmable and pre-set depending on the power savings mode. This can save power in limiting the number of memory transfers (each of which uses power) and can create idle periods during static display in which low-power devices can enter a power-savings mode. The request burst size and threshold level control the spacing in time of these requests.
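The relationship between burst size and available power down time can be illustrated with simple arithmetic. This sketch assumes a steady state in which the display drains the FIFO at a constant rate and each burst replaces exactly what was drained; the function and parameter names are hypothetical and the rates are illustrative.

```python
def burst_interval_us(burst_bytes, drain_bytes_per_us):
    """Time between display bursts: how long one burst feeds the display."""
    return burst_bytes / drain_bytes_per_us


def powerdown_window_us(burst_bytes, drain_bytes_per_us,
                        fill_bytes_per_us, exit_latency_us):
    """Time per burst interval that memory can spend powered down.

    The interval is reduced by the time spent transferring the burst and
    by the time needed to exit the power down state before the next burst.
    """
    interval = burst_interval_us(burst_bytes, drain_bytes_per_us)
    transfer = burst_bytes / fill_bytes_per_us
    return max(0.0, interval - transfer - exit_latency_us)
```

With these assumed numbers, a 4096-byte burst drained at 256 bytes per microsecond leaves a much longer idle window than a 512-byte burst, which is why larger, more widely spaced bursts are preferred in the C2 power state.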

A rendering engine is confirmed or forced to be idle. The chipset is generally in a state that provides opportunities for entering the self-refresh state when graphics rendering is not required or is completed.

In step 208, any or a combination of the following can occur: 1) system memory is placed in self-refresh with clocks and other memory control signals tri-stated for the system memory, 2) memory DLLs not needed during C2 self-refresh state can be placed in power down and/or 3) any other functional block and clock trees that are not needed during C2 self-refresh state can be placed in power down. The decision about which functions can be powered down is dependent on decision logic including comparing impact to powerdown exit latency of the powerdown features versus time available. The time available depends on the maximum latency tolerated by the display and the isochronous stream periodicity and burst size requirements.
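The decision logic can be sketched as a comparison of cumulative exit latency against the time available. The first function follows the sum stated in claim 14 (self-refresh exit time, plus implementation overhead, plus the applicable fraction of DLL power down exit time). The second function is a hypothetical policy, not one given in the description: it assumes each feature adds a known exit latency and admits features in priority order while the total stays below the time available.

```python
def max_powerdown_exit_us(self_refresh_exit_us, overhead_us,
                          dll_exit_us, dll_fraction):
    """Maximum power down exit time as the sum of its three components."""
    return self_refresh_exit_us + overhead_us + dll_fraction * dll_exit_us


def select_features(time_available_us, features):
    """Pick power down features whose combined exit latency fits.

    `features` is a list of (name, added_exit_latency_us) in priority
    order. Assumed policy: a feature is admitted only if the running
    total remains strictly below the time available.
    """
    chosen = []
    total = 0.0
    for name, added_us in features:
        if total + added_us < time_available_us:
            chosen.append(name)
            total += added_us
    return chosen
```

For instance, with 10 microseconds available, features costing 5, 3 and 4 microseconds would admit the first two and skip the third.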

Memory DLLs may be placed in power down mode. In particular, integrated circuits such as DDR DRAMs often generate a plurality of synchronized DLL outputs (phases) and utilize a plurality of operation modes, such that the output signals produced by a circuit such as a DLL are selectively applied to circuits in the device to reduce unnecessary power consumption. In a typical implementation, the power management unit controls a clock generator that clocks other chips in the system, such as the processor, memory controller and memory. Integrated circuits, such as DDR DRAMs, typically include DLLs that provide distributed signals, e.g., clock signals, to multiple circuits. A DLL typically receives a reference clock signal from which it generates an internal clock signal, the phase of which typically depends on the reference clock signal. DLLs are of some complexity and operate at high frequency, and hence consume significant power. It may be desirable to operate a large number of circuits in synchronism with such an internal clock signal. If these circuits are driven in common, the total output load on the DLL can be very large, causing the DLL to consume a large amount of power. Thus, it is advantageous to power down the DLLs.

In step 210, until a bus master request and/or display refresh is confirmed, the self-refresh and dynamic DLL power down remains intact.

In step 212, in response to confirmation that a bus master and/or display refresh request has been executed, the system memory clock is enabled and the system memory placed in an idle mode.

In step 214, the DLLs are powered up. The chipset DLL associated with the memory being used to update display refresh will optionally be kept enabled during the C2 state.

In step 216, the system waits until the DLLs and system memory are both powered up.

In step 218, the next memory burst is executed and the routine returns to step 204. The processor remains in the C2 power state as long as there is not a break event (for example, an interrupt).
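The routine of steps 202 through 218 can be traced with a small simulation. This is a sketch only: the event names `idle`, `request` and `break` are hypothetical, one event is consumed per iteration, and the handling of a request that arrives while memory is still powered up (the check of step 204 simply fails and is retried) is an assumption, since the description does not spell out that branch.

```python
def c2_self_refresh_trace(events):
    """Return the sequence of FIG. 2 step numbers visited for `events`.

    Each event is 'idle' (no memory request pending), 'request' (a bus
    master or display refresh request), or 'break' (for example, an
    interrupt that ends the C2 power state).
    """
    trace = [202]          # processor confirmed to be in C2
    powered_down = False
    for ev in events:
        if ev == "break":
            trace.append("break")   # processor leaves C2
            break
        if not powered_down:
            trace.append(204)       # check for lack of memory requests
            if ev == "idle":
                trace += [206, 208]  # set burst/threshold, power down
                powered_down = True
        elif ev == "idle":
            trace.append(210)       # remain in self-refresh
        else:
            # Wake memory and DLLs, wait, then run the next burst.
            trace += [212, 214, 216, 218]
            powered_down = False
    return trace
```

Calling `c2_self_refresh_trace(["idle", "idle", "request", "idle", "break"])` walks once around the loop, re-enters power down, and then exits on the break event.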

When a break event occurs, the processor transitions back to the CØ power state. In typical implementations, the processor clock is restarted or a signal to the processor is de-asserted to accomplish the transition. The memory burst size and watermark levels are then set in accordance with the CØ power state requirements. During operation in the full power state, such as CØ, the memory bursts are generally smaller and spaced much closer in time, in accordance with the CØ power state. The CØ state imposes a display FIFO size that is large enough to encompass the new C2 burst size and threshold level requirements of this invention.

The above-described method of handling bus requests while the processor is in a low power state may be accomplished by a variety of different apparatus as described in detail below.

For example, FIG. 3 is a diagram of an embodiment of an integrated graphics configuration for placing the memory in self-refresh and DLL in power down mode while maintaining use of bus masters and keeping the display updated during the C2 power state, as illustrated in FIG. 2. The computer system 300 includes processor 302, graphics and memory controller 304 including graphics engine 306, memory 308, display FIFO 310, display pipeline 312 and display device 314. Processor 302 processes data signals and may be a complex instruction set computer (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing a combination of instruction sets, or other processor device, such as a digital signal processor, for example. Processor 302 may be coupled to common bus 312 that transmits data signals between processor 302 and other components in the system 300.

Processor 302 issues signals over common bus 312 for communicating with memory 308 or graphics and memory controller 304 in order to manipulate data as described herein. Processor 302 issues such signals in response to software instructions that it obtains from memory 308. Memory 308 may be a dynamic random access memory (DRAM) device, a static random access memory (SRAM) device, or other memory device. Memory 308 may store instructions and/or data represented by data signals that may be executed by processor 302, graphics engine 306 or some other device. The instructions and/or data may comprise code for performing any and/or all of the techniques of the present invention. Memory 308 may also contain software and/or data. An optional cache memory may be used to speed up memory accesses by the graphics engine 306 by taking advantage of its locality of access.

In some embodiments, graphics engine 306 can offload from processor 302 many of the memory-intensive tasks required for rendering an image. Graphics engine 306 processes data signals and may be a complex instruction set computer (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing a combination of instruction sets, or other processor device, such as a digital signal processor, for example. Graphics engine 306 may be coupled to common bus 312 that transmits data signals between graphics engine 306 and other components in the system 300, including render cache 310 and display device 314. Graphics engine 306 includes rendering hardware that, among other things, writes specific attributes (e.g., colors) to specific pixels of display device 314 and draws complicated primitives on display device 314. Graphics and memory controller 304 communicates with display device 314 to display to a user images rendered or otherwise processed by graphics and memory controller 304. Display device 314 may comprise a computer monitor, television set, flat panel display or other suitable display device.

Memory 308 stores a host operating system that may include one or more rendering programs to build the images of graphics primitives for display. System 300 includes graphics engine 306, such as a graphics accelerator that uses a customized hardware logic device or a co-processor to improve the performance of rendering at least some portion of the graphics primitives otherwise handled by host rendering programs. The host operating system program and its host graphics application program interface (API) control the graphics engine 306 through a driver program.

FIFO 310 receives display data from graphics and memory controller 304 through data bus 318 and outputs display data to display pipeline 312 through data bus 320. Graphics and memory controller 304 decides which one of the devices should be granted access to memory 308. A part of the graphics engine controls block transfer of images to, from or within the memory 308. A memory address generator 322 is connected to graphics and memory controller 304 and display FIFO 310. The memory address generator 322 generates memory addresses to the graphics and memory controller 304. Graphics and memory controller 304 controls memory address generator 322 and display pipeline 312. The graphics and memory controller 304 instructs the memory address generator 322 when to start loading the FIFO 310. Display FIFO 310 is used for receiving and storing display data for the display device 314.

When the FIFO level is greater than the threshold value, a memory burst request for a non-display stream can be generated without harming display. Based upon the comparison of the FIFO data level against the threshold values, a control circuit issues a request to graphics and memory controller 304 for memory access so that data can be loaded into FIFO 310, as illustrated by the flowchart in FIG. 2.
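The threshold comparison can be modeled with a toy FIFO. This is an illustrative sketch, not the control circuit itself: the class `DisplayFifo` and its method names are hypothetical, levels are counted in arbitrary entries, and the constructor check that the threshold plus one burst fits within the FIFO reflects the sizing requirement stated for the display FIFO.

```python
class DisplayFifo:
    """Toy model of a display FIFO with a programmable threshold."""

    def __init__(self, capacity, threshold, burst_size):
        # The FIFO must be able to hold the threshold level plus one burst.
        if threshold + burst_size > capacity:
            raise ValueError("FIFO too small for threshold plus burst size")
        self.capacity = capacity
        self.threshold = threshold
        self.burst_size = burst_size
        self.level = capacity  # start full

    def drain(self, entries):
        """Display pipeline consumes entries from the FIFO."""
        self.level = max(0, self.level - entries)

    def needs_display_burst(self):
        """True once the level has fallen to the threshold."""
        return self.level <= self.threshold

    def non_display_burst_ok(self):
        """Above the threshold, other streams can use memory safely."""
        return self.level > self.threshold

    def fill_burst(self):
        """Load one memory burst into the FIFO."""
        self.level = min(self.capacity, self.level + self.burst_size)
```

A lower threshold paired with a larger burst size spaces display requests further apart, which is the lever step 206 adjusts for the C2 power state.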

FIGS. 4(a) and (b) illustrate flow diagrams of embodiments of routines for placing memory in self-refresh and DLLs in power down mode while maintaining use of bus masters during the C2 power state for a discrete configuration. A discrete chipset configuration has no graphics, and can put memory in self-refresh as long as the isochronous constraints (i.e., isochronous periodicity must be greater than the powerdown exit latency) are met. A discrete graphics controller has a display stream to maintain, but it has no knowledge of the C2 state.

Referring to FIG. 4(a), in one embodiment 400, the discrete graphics controller enters its local memory related powerdown modes, such as a self-refresh state (for reference purposes, called the graphics C2 power state) (step 404), whenever there are no outstanding requests to local memory (step 402).

Referring to FIG. 4(b), in another embodiment 406, a discrete graphics controller computes the demand based on bandwidth threshold and/or duration of local memory request idleness on its local memory (step 408). In response to the demand being sufficiently low, it enters its local memory into self-refresh (step 410).
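The demand computation of FIG. 4(b) can be sketched as follows. All names here are hypothetical, and because the description allows the bandwidth threshold and the idle duration to be used separately or together, the `require_both` flag is an assumed way of expressing that choice.

```python
def demand_is_low(bytes_in_window, window_us, idle_us,
                  bandwidth_threshold, idle_threshold_us,
                  require_both=False):
    """Decide whether local memory demand is low enough for self-refresh.

    bandwidth_threshold is in bytes per microsecond; idle_us is how long
    local memory has gone without a request.
    """
    bandwidth = bytes_in_window / window_us
    low_bandwidth = bandwidth < bandwidth_threshold
    long_idle = idle_us >= idle_threshold_us
    if require_both:
        return low_bandwidth and long_idle
    return low_bandwidth or long_idle
```

Under this sketch, step 410 would be taken whenever `demand_is_low(...)` returns `True`.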

Having described the invention in accordance with the requirements of the patent statutes, those skilled in the art will understand how to make changes and modifications to the present invention to meet their specific requirements or conditions. Such changes and modifications may be made without departing from the scope and spirit of the invention as set forth in the following claims. 

1. A method for conserving power in an electronic device, comprising: automatically transitioning the electronic device into a power reduced mode of operation in response to no outstanding memory requests.
 2. The method claimed in claim 1, further comprising: automatically transitioning the electronic device into a power reduced mode of operation in response to a deterministic set of configurations being met.
 3. The method claimed in claim 2, wherein automatically transitioning the electronic device into a power reduced mode of operation in response to a deterministic set of configurations being met further comprises: placing the memory in self-refresh in response to a deterministic set of configurations being met.
 4. The method claimed in claim 3, wherein automatically transitioning the electronic device into a power reduced mode of operation in response to a deterministic set of configurations being met further comprises: placing clocks, control signals, clock trees, DLLs, or other unnecessary logic/circuits in power down mode in response to a deterministic set of configurations being met.
 5. The method claimed in claim 4, wherein automatically transitioning the electronic device into a power reduced mode of operation in response to a deterministic set of configurations being met further comprises: keeping the isochronous data updated and servicing bus master data in the reduced power mode.
 6. The method claimed in claim 5, wherein the power savings mode comprises a C2 power savings mode.
 7. The method claimed in claim 5, wherein placing the memory in self-refresh in response to a deterministic set of configurations being met further comprises: determining whether the combination of isochronous and bus master data exceeds a predefined buffering threshold; and placing the memory in self-refresh in response to the combination not exceeding a predefined threshold.
 8. The method claimed in claim 7, wherein the predefined threshold covers the maximum exit latency for memory to come out of self-refresh.
 9. The method claimed in claim 8, wherein isochronous data includes display data.
 10. The method claimed in claim 8, wherein determining whether the combination of isochronous and bus master data exceeds a predefined buffering threshold further comprises: accessing parameters of isochronous and bus master data; and using parameters to precompute whether powerdown mode exit latencies fall within the predefined threshold.
 11. The method claimed in claim 10, wherein accessing parameters of isochronous and bus master data further comprises: using the BIOS/driver to access isochronous and bus master data parameters.
 12. The method claimed in claim 11, further comprising: representing the computation by coding of memory controller configuration registers or state machines controlling powerdown modes such as memory self refresh or DLL powerdown, or clock disabling.
 13. The method claimed in claim 12, further comprising: computing on-the-fly whether powerdown exit latencies fall within the predetermined threshold.
 14. The method claimed in claim 8, wherein determining whether the combination of isochronous and bus master data exceeds a predefined threshold further comprises: computing the maximum powerdown exit time in accordance with: maximum powerdown exit time=self refresh exit time+exit time implementation overhead/inefficiencies+applicable fraction of DLL powerdown exit time.
 15. The method claimed in claim 14, wherein display latency tolerance is determined in accordance with FIFO size, and display mode requirements.
 16. The method claimed in claim 15, wherein display latency tolerance is greater than the maximum powerdown exit time.
 17. The method claimed in claim 16, wherein the isochronous latency tolerance is determined by FIFO size and minimum periodicity interval requirements.
 18. The method claimed in claim 17, wherein the isochronous latency tolerance is greater than the maximum powerdown exit time.
 19. A system, comprising: a memory, and a power management logic to automatically transition an electronic device into a power reduced mode of operation in response to no outstanding memory requests.
 20. The system claimed in claim 19, wherein the power management logic automatically transitions the electronic device into a power reduced mode of operation in response to a deterministic set of configurations being met.
 21. The system claimed in claim 20, wherein the power management logic places the memory in self-refresh in response to a deterministic set of configurations being met.
 22. The system claimed in claim 21, wherein the power management logic places clocks or DLLs in power down mode in response to a deterministic set of configurations being met.
 23. The system claimed in claim 22, wherein the power management logic keeps isochronous data updated and services bus master data in reduced power mode.
 24. The system claimed in claim 23, wherein the power savings mode comprises a C2 power savings mode.
 25. The system claimed in claim 23, wherein the power management logic determines whether the combination of isochronous and bus master data exceeds a predefined buffering threshold and places the memory in self-refresh in response to the combination not exceeding a predefined threshold.
 26. The system claimed in claim 25, wherein the predefined threshold covers the maximum exit latency for memory to come out of self-refresh.
 27. The system claimed in claim 26, wherein isochronous data includes display data.
 28. The system claimed in claim 26, wherein the power management logic accesses parameters of isochronous and bus master data, and uses parameters to precompute whether powerdown mode exit latencies fall within the predefined threshold.
 29. The system claimed in claim 28, wherein the power management logic uses a BIOS or driver to access isochronous and bus master data parameters.
 30. The system claimed in claim 25, wherein the power management logic computes on-the-fly whether powerdown exit latencies fall within the predefined threshold.
 31. A machine-accessible medium including instructions that, when executed, cause a machine to: transition an electronic device into a power reduced mode of operation in response to lack of memory requests.
 32. The machine-accessible medium claimed in claim 31, further comprising: transitioning the electronic device into a power reduced mode of operation in response to a deterministic set of configurations being met.
 33. The machine-accessible medium claimed in claim 31, wherein transitioning the electronic device into a power reduced mode of operation in response to a deterministic set of configurations being met further comprises: placing the memory in self-refresh in response to a deterministic set of configurations being met.
 34. The machine-accessible medium claimed in claim 31, wherein transitioning the electronic device into a power reduced mode of operation in response to a deterministic set of configurations being met further comprises: placing clocks, control signals, clock trees, DLLs, or other unnecessary logic/circuits in power down mode in response to a deterministic set of configurations being met.
 35. A system, comprising: a memory manager to automatically transition an electronic device into a power reduced mode of operation in response to no outstanding memory requests.
 36. The system claimed in claim 35, wherein the memory manager transitions the electronic device into a power reduced mode of operation in response to a deterministic set of configurations being met.
 37. The system claimed in claim 35, wherein the power management logic places clocks or DLLs in power down mode in response to a deterministic set of configurations being met.
 38. The system claimed in claim 35, wherein the power management logic keeps isochronous data updated and services bus master data in reduced power mode.